VON NEUMANN’S INEQUALITY FOR TENSORS
Abstract. For two matrices in $\mathbb{R}^{n_1 \times n_2}$, the von Neumann inequality
says that their scalar product is less than or equal to the scalar product
of their singular spectra. In this short note, we extend this result to
real tensors and provide a complete study of the equality case.


1. Introduction 

The goal of this paper is to generalize von Neumann’s inequality from
matrices to tensors. Consider two matrices $X$ and $Y$ in $\mathbb{R}^{n_1 \times n_2}$.
Denote their singular spectra, i.e. the vectors of their singular values, by
$\sigma(X)$ and $\sigma(Y)$, respectively. The classical matrix von Neumann inequality [B]
says that

$\langle X, Y \rangle \le \langle \sigma(X), \sigma(Y) \rangle,$

and equality is achieved if and only if $X$ and $Y$ have the same singular
subspaces. Von Neumann’s inequality, and the characterization of the equality
case in this inequality, are important in many areas of mathematics.
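
As a quick numerical illustration of the matrix inequality above, the following sketch compares the two scalar products for random matrices. The matrix sizes, the random seed and the variable names are arbitrary choices made for this example, and the scalar product is assumed to be the Frobenius one, $\langle X, Y \rangle = \mathrm{trace}(X^{t} Y)$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
Y = rng.standard_normal((4, 3))

# Frobenius scalar product <X, Y> = trace(X^t Y).
inner = float(np.sum(X * Y))

# Singular spectra; numpy returns them sorted in decreasing order.
sx = np.linalg.svd(X, compute_uv=False)
sy = np.linalg.svd(Y, compute_uv=False)
bound = float(sx @ sy)

# A simple equality case: Y = 2 X has the same singular subspaces as X.
inner_eq = float(np.sum(X * (2.0 * X)))
bound_eq = float(sx @ (2.0 * sx))
```

In the second case both sides reduce to twice the squared Frobenius norm of $X$, so they agree up to rounding.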

For tensors, the task of generalizing von Neumann’s inequality is rendered
harder by the need to define the singular values and the Singular Value
Decomposition (SVD) appropriately. In this paper, we will use the
SVD defined in [T], which is based on the Tucker decomposition.

Our main result is given in Theorem 3.1 below and gives a characterization
of the equality case. We expect this result to be useful for the description
of the subdifferential of some tensor functions, as the matrix counterpart has
proved to be for matrix functions [3]. Such functions occur naturally in
computational statistics, machine learning and numerical analysis, due to the
recent interest in sparsity-promoting norms as convex surrogates for rank
penalization.


2. Main facts about tensors 

Let $D$ and $n_1, \ldots, n_D$ be positive integers. Let $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_D}$ denote
a $D$-dimensional array of real numbers. We will also refer to such arrays as
tensors.

2.1. Basic notations and operations. A subtensor of $\mathcal{X}$ is a tensor
obtained by fixing some of its coordinates. As an example, fixing one coordinate
$i_d = k$ in $\mathcal{X}$ for some $k \in \{1, \ldots, n_d\}$ yields a tensor in
$\mathbb{R}^{n_1 \times \cdots \times n_{d-1} \times n_{d+1} \times \cdots \times n_D}$.
In the sequel, we will denote this subtensor of $\mathcal{X}$ by $\mathcal{X}_{i_d = k}$.

The fibers of a tensor are subtensors that have only one mode, i.e. they are
obtained by fixing every coordinate except one. The mode-$d$ fibers are the vectors

$\left( \mathcal{X}_{i_1, \ldots, i_{d-1}, i_d, i_{d+1}, \ldots, i_D} \right)_{i_d = 1, \ldots, n_d}.$

They extend the notion of columns and rows from the matrix to the tensor
framework. For a matrix, the mode-1 fibers are the columns and the mode-2
fibers are the rows.

The mode-$d$ matricization $\mathcal{X}_{(d)}$ of $\mathcal{X}$ is obtained by forming the matrix
whose columns are the mode-$d$ fibers of the tensor, arranged in the
lexicographic ordering [7]. Clearly, the $k$th row of $\mathcal{X}_{(d)}$ consists of the entries
of $\mathcal{X}_{i_d = k}$.
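
The matricization just described can be sketched in a few lines of NumPy. The helper name `unfold` and the 0-based mode index are conventions of this example only. The columns are arranged in NumPy's row-major order of the remaining indices, which may differ from the ordering of [7] by a permutation of the columns; such a permutation changes neither the rows' contents nor the singular values.

```python
import numpy as np

def unfold(X, d):
    """Mode-d matricization (d is 0-based): rows are indexed by i_d,
    columns are the mode-d fibers of X."""
    return np.moveaxis(X, d, 0).reshape(X.shape[d], -1)

X = np.arange(24, dtype=float).reshape(2, 3, 4)
X1 = unfold(X, 1)      # a 3 x 8 matrix

row_k = X1[2]          # entries of the subtensor obtained by fixing i_d = 2
fiber = X1[:, 0]       # the mode-d fiber with all other indices equal to 0
```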

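The mode-$d$ multiplication of a tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_D}$ by a matrix
$U \in \mathbb{R}^{n_d' \times n_d}$, denoted by $\mathcal{X} \times_d U$, gives a tensor in
$\mathbb{R}^{n_1 \times \cdots \times n_d' \times \cdots \times n_D}$. It is defined as

$(\mathcal{X} \times_d U)_{i_1, \ldots, i_{d-1}, i_d', i_{d+1}, \ldots, i_D} = \sum_{i_d = 1}^{n_d} \mathcal{X}_{i_1, \ldots, i_{d-1}, i_d, i_{d+1}, \ldots, i_D} \, U_{i_d', i_d}.$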
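
A minimal NumPy sketch of the mode-$d$ multiplication follows, assuming the usual convention that $\mathcal{X} \times_d U$ contracts the index $i_d$ of $\mathcal{X}$ with the column index of $U$. The helper names `unfold` and `mode_product` and the 0-based mode index are choices made for this example.

```python
import numpy as np

def unfold(X, d):
    """Mode-d matricization (d is 0-based), row-major column ordering."""
    return np.moveaxis(X, d, 0).reshape(X.shape[d], -1)

def mode_product(X, U, d):
    """Mode-d product X x_d U: sum over i_d of X[..., i_d, ...] * U[j, i_d]."""
    # tensordot places the new axis first; move it back to position d.
    return np.moveaxis(np.tensordot(U, X, axes=(1, d)), 0, d)

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 3, 4))
U = rng.standard_normal((5, 3))
Z = mode_product(X, U, 1)   # a tensor of size 2 x 5 x 4
```

With these conventions, the mode-$d$ matricization of the product is the matrix product of $U$ with the mode-$d$ matricization of the original tensor, which gives a convenient sanity check.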

2.2. Higher Order Singular Value Decomposition (HOSVD). The
Tucker decomposition of a tensor is a very useful decomposition, which can be
chosen so that, with appropriate orthogonal transformations, one can reveal a
tensor $\mathcal{S}$ hidden inside $\mathcal{X}$ with interesting rank and orthogonality properties.
More precisely, we have

(2.1) $\mathcal{X} = \mathcal{S}(\mathcal{X}) \times_1 U^{(1)} \cdots \times_D U^{(D)},$

where each $U^{(d)} \in \mathbb{R}^{n_d \times n_d}$ is orthogonal and $\mathcal{S}(\mathcal{X})$ is a tensor of the same size
as $\mathcal{X}$ defined as follows. Moreover, the subtensors $\mathcal{S}(\mathcal{X})_{i_d = k}$ for $k = 1, \ldots, n_d$
are all orthogonal to each other for each $d = 1, \ldots, D$.

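2.2.1. Relationship with matricization. A tensor can be matricized along
each of its modes. Let $\otimes$ denote the standard Kronecker product for
matrices. Then, the mode-$d$ matricization of a tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_D}$ is given
by

(2.2) $\mathcal{X}_{(d)} = U^{(d)} \cdot \mathcal{S}(\mathcal{X})_{(d)} \cdot \left( U^{(d+1)} \otimes \cdots \otimes U^{(D)} \otimes U^{(1)} \otimes \cdots \otimes U^{(d-1)} \right)^{t}.$

Take the (usual) SVD of the matrix $\mathcal{X}_{(d)}$,

$\mathcal{X}_{(d)} = U^{(d)} \Sigma^{(d)} {V^{(d)}}^{t},$

and, based on (2.2), we can set

$\mathcal{S}(\mathcal{X})_{(d)} = \Sigma^{(d)} {V^{(d)}}^{t} \left( U^{(d+1)} \otimes \cdots \otimes U^{(D)} \otimes U^{(1)} \otimes \cdots \otimes U^{(d-1)} \right),$

where $\mathcal{S}(\mathcal{X})_{(d)}$ is the mode-$d$ matricization of $\mathcal{S}(\mathcal{X})$. One proceeds similarly
for all $d = 1, \ldots, D$ and one recovers the orthogonal matrices $U^{(1)}, \ldots, U^{(D)}$
which allow us to decompose $\mathcal{X}$ as in (2.1).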
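
The construction above can be sketched numerically as follows. This is only an illustration under stated assumptions: the helper names (`unfold`, `mode_product`, `hosvd`) are hypothetical, modes are 0-based, and the matricization uses NumPy's row-major column ordering, so the Kronecker factor in (2.2) appears in a different order without affecting the result.

```python
import numpy as np

def unfold(X, d):
    """Mode-d matricization (d is 0-based), row-major column ordering."""
    return np.moveaxis(X, d, 0).reshape(X.shape[d], -1)

def mode_product(X, U, d):
    """Mode-d product X x_d U."""
    return np.moveaxis(np.tensordot(U, X, axes=(1, d)), 0, d)

def hosvd(X):
    """Return (S, [U1, ..., UD]) such that X = S x_1 U1 ... x_D UD."""
    # Left singular vectors of each matricization give the factor U^(d).
    Us = [np.linalg.svd(unfold(X, d), full_matrices=True)[0]
          for d in range(X.ndim)]
    S = X
    for d, U in enumerate(Us):
        S = mode_product(S, U.T, d)   # apply the transposes to isolate S
    return S, Us

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 4, 2))
S, Us = hosvd(X)

# Rebuild X from its core tensor and orthogonal factors.
X_rec = S
for d, U in enumerate(Us):
    X_rec = mode_product(X_rec, U, d)

# Gram matrices of the rows of each matricization of S.
grams = [unfold(S, d) @ unfold(S, d).T for d in range(X.ndim)]
```

Each Gram matrix is diagonal, which reflects the mutual orthogonality of the subtensors of the core tensor, and its diagonal carries the squared singular values of the corresponding matricization of the original tensor.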

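2.2.2. The spectrum. The mode-$d$ spectrum is defined as the vector of
singular values of $\mathcal{X}_{(d)}$; we will denote it by $\sigma^{(d)}(\mathcal{X})$. Notice that this
construction implies that, for every mode $d$, the subtensors $\mathcal{S}(\mathcal{X})_{i_d = k}$,
$k = 1, \ldots, n_d$, are mutually orthogonal. With a slight
abuse of notation, we will denote by $\sigma$ the mapping which to each tensor $\mathcal{X}$
assigns the vector $\frac{1}{\sqrt{D}} \left( \sigma^{(1)}(\mathcal{X}), \ldots, \sigma^{(D)}(\mathcal{X}) \right)$ of all mode-$d$ singular spectra.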

Figure 1. A block-wise diagonal tensor.

3. Main Result 

3.1. The main theorem. The main result of this paper is the following 
theorem. 

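Theorem 3.1. Let $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{n_1 \times \cdots \times n_D}$ be tensors. Then for all $d = 1, \ldots, D$,
we have

(3.3) $\langle \mathcal{X}, \mathcal{Y} \rangle \le \langle \sigma^{(d)}(\mathcal{X}), \sigma^{(d)}(\mathcal{Y}) \rangle.$

The equality in (3.3) holds simultaneously for all $d = 1, \ldots, D$ if and only if
there exist orthogonal matrices $W^{(d)} \in \mathbb{R}^{n_d \times n_d}$, $d = 1, \ldots, D$, and tensors
$\mathcal{D}(\mathcal{X}), \mathcal{D}(\mathcal{Y}) \in \mathbb{R}^{n_1 \times \cdots \times n_D}$ such that

$\mathcal{X} = \mathcal{D}(\mathcal{X}) \times_1 W^{(1)} \cdots \times_D W^{(D)},$
$\mathcal{Y} = \mathcal{D}(\mathcal{Y}) \times_1 W^{(1)} \cdots \times_D W^{(D)},$

where $\mathcal{D}(\mathcal{X})$ and $\mathcal{D}(\mathcal{Y})$ satisfy the following properties:

• $\mathcal{D}(\mathcal{X})$ and $\mathcal{D}(\mathcal{Y})$ are block-wise diagonal with the same number and
size of blocks.

• Let $L$ be the number of blocks and $\{\mathcal{D}_l(\mathcal{X})\}_{l = 1, \ldots, L}$ (resp. $\{\mathcal{D}_l(\mathcal{Y})\}_{l = 1, \ldots, L}$)
be the blocks on the diagonal of $\mathcal{D}(\mathcal{X})$ (resp. $\mathcal{D}(\mathcal{Y})$). Then for each
$l = 1, \ldots, L$, the two blocks $\mathcal{D}_l(\mathcal{X})$ and $\mathcal{D}_l(\mathcal{Y})$ are proportional.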
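
The inequality (3.3) can be checked numerically on random tensors, mode by mode. The sketch below is an illustration only: the helper name `mode_spectrum`, the tensor sizes and the seed are assumptions of this example, modes are 0-based, and the scalar product of two tensors is taken to be the sum of the products of their entries.

```python
import numpy as np

def mode_spectrum(X, d):
    """Singular values of the mode-d matricization of X (d is 0-based)."""
    Xd = np.moveaxis(X, d, 0).reshape(X.shape[d], -1)
    return np.linalg.svd(Xd, compute_uv=False)

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 4, 5))
Y = rng.standard_normal((3, 4, 5))

# Scalar product of the two tensors.
inner = float(np.sum(X * Y))

# One upper bound per mode: <sigma^(d)(X), sigma^(d)(Y)>.
bounds = [float(mode_spectrum(X, d) @ mode_spectrum(Y, d))
          for d in range(X.ndim)]
gaps = [b - inner for b in bounds]   # all gaps should be nonnegative
```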

3.2. Proof of the main theorem. In this section, we prove Theorem 3.1.
If $\mathcal{X}$ or $\mathcal{Y}$ is a zero tensor, then the result is trivial. In the sequel, we assume
that both $\mathcal{X}$ and $\mathcal{Y}$ are non-zero tensors.

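3.2.1. The "if" part. The "if" part of the result is straightforward. Notice
that $\langle \mathcal{X}, \mathcal{Y} \rangle = \langle \mathcal{D}(\mathcal{X}), \mathcal{D}(\mathcal{Y}) \rangle$ and that the mode-$d$ singular values of $\mathcal{X}$ (resp. $\mathcal{Y}$) are
equal to those of $\mathcal{D}(\mathcal{X})$ (resp. $\mathcal{D}(\mathcal{Y})$), since the matrices $W^{(d)}$ are orthogonal.
Therefore, it remains to prove that

(3.4) $\langle \mathcal{D}(\mathcal{X}), \mathcal{D}(\mathcal{Y}) \rangle = \langle \sigma^{(d)}(\mathcal{D}(\mathcal{X})), \sigma^{(d)}(\mathcal{D}(\mathcal{Y})) \rangle, \quad d = 1, \ldots, D.$

The conditions that $\mathcal{D}(\mathcal{X})$ and $\mathcal{D}(\mathcal{Y})$ are block-wise diagonal and that $\mathcal{D}_l(\mathcal{X})$
and $\mathcal{D}_l(\mathcal{Y})$ are proportional imply that each row of $\mathcal{D}_{(d)}(\mathcal{X})$ and the corresponding row of
$\mathcal{D}_{(d)}(\mathcal{Y})$ are parallel. It then follows that $\mathcal{D}_{(d)}(\mathcal{X})$ and $\mathcal{D}_{(d)}(\mathcal{Y})$ have the same
left and right singular vectors. Then, applying the matrix von Neumann
result immediately gives (3.4).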
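
The equality case can be illustrated on a small hand-built example. Everything specific below is an assumption of this sketch: the helper names, the block sizes (one 2 x 2 x 2 block and one 1 x 1 x 1 block), the entries, and the ratios 2 and 0.5. The values are picked so that, in every mode, the singular values coming from the first block dominate those of the second block for both tensors; the ordered singular spectra are then paired consistently, as the matrix equality case requires.

```python
import numpy as np

def unfold(X, d):
    """Mode-d matricization (d is 0-based), row-major column ordering."""
    return np.moveaxis(X, d, 0).reshape(X.shape[d], -1)

def mode_product(X, U, d):
    """Mode-d product X x_d U."""
    return np.moveaxis(np.tensordot(U, X, axes=(1, d)), 0, d)

# Block-wise diagonal D(X): a 2x2x2 block and a 1x1x1 block.
DX = np.zeros((3, 3, 3))
DX[0, 0, 0], DX[0, 1, 1], DX[1, 1, 1] = 3.0, 1.0, 2.0
DX[2, 2, 2] = 0.1

# D(Y): the same blocks, rescaled by the positive ratios 2 and 0.5.
DY = np.zeros_like(DX)
DY[:2, :2, :2] = 2.0 * DX[:2, :2, :2]
DY[2, 2, 2] = 0.5 * DX[2, 2, 2]

# Random orthogonal matrices W^(d), shared by X and Y.
rng = np.random.default_rng(4)
Ws = [np.linalg.qr(rng.standard_normal((3, 3)))[0] for _ in range(3)]
X, Y = DX, DY
for d, W in enumerate(Ws):
    X, Y = mode_product(X, W, d), mode_product(Y, W, d)

inner = float(np.sum(X * Y))
bounds = [float(np.linalg.svd(unfold(X, d), compute_uv=False)
                @ np.linalg.svd(unfold(Y, d), compute_uv=False))
          for d in range(3)]
```

Here the scalar product and the three mode-wise bounds coincide up to rounding, so equality holds simultaneously in all modes for this particular pair.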


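3.2.2. The "only if" part: first step. Assume that

$\langle \mathcal{X}, \mathcal{Y} \rangle = \langle \sigma^{(d)}(\mathcal{X}), \sigma^{(d)}(\mathcal{Y}) \rangle, \quad d = 1, \ldots, D.$

By the classical results on the matrix von Neumann inequality, we know that
the equality holds if and only if there exist orthogonal matrices $U^{(d)}$ and
$V^{(d)}$ such that

(3.5) $\mathcal{X}_{(d)} = U^{(d)} \mathrm{Diag}(\sigma^{(d)}(\mathcal{X})) {V^{(d)}}^{t}$ and $\mathcal{Y}_{(d)} = U^{(d)} \mathrm{Diag}(\sigma^{(d)}(\mathcal{Y})) {V^{(d)}}^{t}$

for all $d = 1, \ldots, D$. From this remark, we obtain the following HOSVD of
$\mathcal{X}$ and $\mathcal{Y}$:

$\mathcal{X} = \mathcal{S}(\mathcal{X}) \times_1 U^{(1)} \cdots \times_D U^{(D)},$
$\mathcal{Y} = \mathcal{S}(\mathcal{Y}) \times_1 U^{(1)} \cdots \times_D U^{(D)}.$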

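3.2.3. The "only if" part: second step. We now show that the subtensors $\mathcal{S}(\mathcal{X})_{i_d = k}$
and $\mathcal{S}(\mathcal{Y})_{i_d = k}$ must be parallel for all $k = 1, \ldots, n_d$ and $d = 1, \ldots, D$.

Comparing (3.5) with (2.2), we deduce that

(3.6) $\mathcal{S}_{(d)}(\mathcal{X}) = \begin{pmatrix} \sigma_1^{(d)}(\mathcal{X}) \cdot p_1^{t} \\ \vdots \\ \sigma_{n_d}^{(d)}(\mathcal{X}) \cdot p_{n_d}^{t} \end{pmatrix},$

where $p_i^{t}$ denotes the $i$th row of the matrix ${V^{(d)}}^{t} \left( U^{(d+1)} \otimes \cdots \otimes U^{(D)} \otimes U^{(1)} \otimes \cdots \otimes U^{(d-1)} \right)$.

Similarly, we have

(3.7) $\mathcal{S}_{(d)}(\mathcal{Y}) = \begin{pmatrix} \sigma_1^{(d)}(\mathcal{Y}) \cdot p_1^{t} \\ \vdots \\ \sigma_{n_d}^{(d)}(\mathcal{Y}) \cdot p_{n_d}^{t} \end{pmatrix}.$

Comparing now (3.6) and (3.7) reveals that the $i$th row of $\mathcal{S}_{(d)}(\mathcal{X})$ and the
$i$th row of $\mathcal{S}_{(d)}(\mathcal{Y})$ must be proportional, for all $i = 1, \ldots, n_d$. Formally, this
means

(3.8) $\sigma_{i_d}^{(d)}(\mathcal{Y}) \cdot \mathcal{S}(\mathcal{X})_{i_1, \ldots, i_d, \ldots, i_D} = \sigma_{i_d}^{(d)}(\mathcal{X}) \cdot \mathcal{S}(\mathcal{Y})_{i_1, \ldots, i_d, \ldots, i_D}$

for all $d = 1, \ldots, D$ and all possible values of $i_1, \ldots, i_D$.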
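3.2.4. The "only if" part: third step. For $d = 1, \ldots, D$, let $r_{\mathcal{X}}^{(d)}$ (resp. $r_{\mathcal{Y}}^{(d)}$)
be the rank of $\mathcal{S}_{(d)}(\mathcal{X})$ (resp. $\mathcal{S}_{(d)}(\mathcal{Y})$). Inequalities between multi-indices
are understood componentwise. Let $(i_1, \ldots, i_D)$ be such that

$(i_1, \ldots, i_D) \not\le (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)}).$

Then, $\mathcal{S}(\mathcal{Y})_{i_1, \ldots, i_D} = 0$. If, moreover, there exists $d \in \{1, \ldots, D\}$ such that $\sigma_{i_d}^{(d)}(\mathcal{Y}) > 0$,
then using (3.8), we obtain that $\mathcal{S}(\mathcal{X})_{i_1, \ldots, i_D} = 0$. Thus, if $r_{\mathcal{X}}^{(d)} > r_{\mathcal{Y}}^{(d)}$ for some
$d = 1, \ldots, D$, then there exists some

$(i_1, \ldots, i_D) > (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$

such that $\mathcal{S}(\mathcal{X})_{i_1, \ldots, i_D} \ne 0$ and thus, $r_{\mathcal{X}}^{(d)} > r_{\mathcal{Y}}^{(d)}$ for all $d = 1, \ldots, D$. By
symmetry, we deduce that either $r_{\mathcal{X}}^{(d)} > r_{\mathcal{Y}}^{(d)}$ for all $d = 1, \ldots, D$, or $r_{\mathcal{X}}^{(d)} < r_{\mathcal{Y}}^{(d)}$
for all $d = 1, \ldots, D$, or else $r_{\mathcal{X}}^{(d)} = r_{\mathcal{Y}}^{(d)}$ for all $d = 1, \ldots, D$.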

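3.2.5. The "only if" part: fourth step. Assume that $(r_{\mathcal{X}}^{(1)}, \ldots, r_{\mathcal{X}}^{(D)}) \le (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$.
The other case may be treated in the same way (with an overlap in the
equality case) by interchanging the roles of $\mathcal{X}$ and $\mathcal{Y}$. For all $(i_1, \ldots, i_D) \le
(r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$, we have $\sigma_{i_d}^{(d)}(\mathcal{Y}) > 0$ for all $d = 1, \ldots, D$. Thus, (3.8) gives

$\mathcal{S}(\mathcal{X})_{i_1, \ldots, i_D} = \frac{\sigma_{i_d}^{(d)}(\mathcal{X})}{\sigma_{i_d}^{(d)}(\mathcal{Y})} \cdot \mathcal{S}(\mathcal{Y})_{i_1, \ldots, i_D}, \quad d = 1, \ldots, D.$

We deduce from this equation that for two indices $(i_1, \ldots, i_D)$ and $(i_1', \ldots, i_D')$,
if

(i) there exists some $d$ in $\{1, \ldots, D\}$ such that $i_d = i_d'$,

(ii) $\mathcal{S}(\mathcal{X})_{i_1, \ldots, i_D}$ and $\mathcal{S}(\mathcal{X})_{i_1', \ldots, i_D'}$ are different from zero,

then

$\left( \mathcal{S}(\mathcal{X})_{i_1, \ldots, i_D}, \; \mathcal{S}(\mathcal{X})_{i_1', \ldots, i_D'} \right) = \rho \cdot \left( \mathcal{S}(\mathcal{Y})_{i_1, \ldots, i_D}, \; \mathcal{S}(\mathcal{Y})_{i_1', \ldots, i_D'} \right),$

where

$\rho = \frac{\sigma_{i_1}^{(1)}(\mathcal{X})}{\sigma_{i_1}^{(1)}(\mathcal{Y})} = \cdots = \frac{\sigma_{i_D}^{(D)}(\mathcal{X})}{\sigma_{i_D}^{(D)}(\mathcal{Y})} > 0.$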


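3.2.6. The "only if" part: fifth step. Let $\rho_1 > \cdots > \rho_L$ denote the possible
values of the ratio

$\frac{\sigma_{i_d}^{(d)}(\mathcal{X})}{\sigma_{i_d}^{(d)}(\mathcal{Y})}, \quad d = 1, \ldots, D,$

for all $(i_1, \ldots, i_D) \le (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$. Let
$I_{d,l}$, $d = 1, \ldots, D$, $l = 1, \ldots, L$, denote the possibly empty set of indices $i_d$ in
$\{1, \ldots, r_{\mathcal{Y}}^{(d)}\}$ such that

$\frac{\sigma_{i_d}^{(d)}(\mathcal{X})}{\sigma_{i_d}^{(d)}(\mathcal{Y})} = \rho_l,$

and let $m_{d,l}$ denote the cardinality of $I_{d,l}$. Then, for each $d = 1, \ldots, D$, we
can find a permutation $\pi_d$ on $\{1, \ldots, n_d\}$ such that $\pi_d(I_{d,1}) = \{1, \ldots, m_{d,1}\}$,
$\pi_d(I_{d,2}) = \{m_{d,1} + 1, \ldots, m_{d,1} + m_{d,2}\}$, and so on and so forth.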
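
The grouping of indices by ratio value can be sketched as follows for a single mode. The function name `grouping_permutation`, the 0-based indexing, the rounding used to compare floating-point ratios, and the sample singular values are all assumptions of this illustration; only indices with a positive denominator are handled.

```python
import numpy as np

def grouping_permutation(sigma_x, sigma_y):
    """Return (perm, sizes): perm[i] is the new 0-based position of index i,
    so that indices sharing the same ratio sigma_x[i] / sigma_y[i] become
    contiguous, ordered by decreasing ratio; sizes lists the group sizes."""
    ratios = np.round(np.asarray(sigma_x) / np.asarray(sigma_y), 12)
    values = sorted(set(ratios.tolist()), reverse=True)   # rho_1 > ... > rho_L
    # Indices listed group by group, keeping their original order in a group.
    order = [i for rho in values
             for i in range(len(ratios)) if ratios[i] == rho]
    perm = np.empty(len(order), dtype=int)
    perm[order] = np.arange(len(order))                   # pi(order[k]) = k
    sizes = [int(np.sum(ratios == rho)) for rho in values]
    return perm, sizes

sigma_x = [6.0, 4.0, 3.0, 1.0]
sigma_y = [3.0, 4.0, 1.5, 1.0]
perm, sizes = grouping_permutation(sigma_x, sigma_y)
```

In this sample the ratios are 2, 1, 2, 1, so the indices 0 and 2 are sent to the first two positions and the indices 1 and 3 to the last two.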
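Thus, for each mode $d = 1, \ldots, D$, there exists a permutation matrix $\Pi_d$
such that the tensors

$\mathcal{D}(\mathcal{X}) = \mathcal{S}(\mathcal{X}) \times_1 \Pi_1 \cdots \times_D \Pi_D,$

and

$\mathcal{D}(\mathcal{Y}) = \mathcal{S}(\mathcal{Y}) \times_1 \Pi_1 \cdots \times_D \Pi_D,$

contain $L$ blocks and each block in $\mathcal{D}(\mathcal{X})$ is proportional to the
corresponding block in $\mathcal{D}(\mathcal{Y})$. Moreover, any entry of $\mathcal{D}(\mathcal{X})$ or $\mathcal{D}(\mathcal{Y})$ with index
$(i_1, \ldots, i_D) \le (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$ and lying outside the union of these $L$ blocks is null since, if it
were not, by (3.8) combined with $(r_{\mathcal{X}}^{(1)}, \ldots, r_{\mathcal{X}}^{(D)}) \le (r_{\mathcal{Y}}^{(1)}, \ldots, r_{\mathcal{Y}}^{(D)})$, the entries
$\mathcal{D}(\mathcal{X})_{i_1, \ldots, i_D}$ and $\mathcal{D}(\mathcal{Y})_{i_1, \ldots, i_D}$ would be proportional with two different
ratios, which is a contradiction.

Finally, since $\mathcal{S}(\mathcal{X}) = \mathcal{D}(\mathcal{X}) \times_1 \Pi_1^{t} \cdots \times_D \Pi_D^{t}$ and similarly for $\mathcal{Y}$,
setting $W^{(d)} = U^{(d)} \Pi_d^{t}$ for $d = 1, \ldots, D$ completes the proof.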